
    Detection of Outliers in Time Series Data

    This thesis addresses the detection of outliers in time series. The data set used in this work is provided by the GasDay Project at Marquette University, which produces mathematical models to predict the consumption of natural gas for Local Distribution Companies (LDCs). Flow data with no outliers is required to develop and train accurate models. GasDay uses statistical approaches motivated by normally distributed samples, such as the 3-sigma rule and the 5-sigma rule, to aid the experts in detecting outliers in residuals from the models. However, the Jarque-Bera statistical test shows that the residuals from the GasDay models are not normally distributed. We present an explanation of Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and how it is used to detect time series outliers. We introduce a new application for the DBSCAN algorithm by adapting it to detect outliers in natural gas flow. The performance of DBSCAN is compared with GasDay's existing technique. Five data sets from temperature-sensitive operating areas with identified outliers and 1000 data sets with synthetic outliers are used in the evaluation process. The 1000 synthetic data sets are prepared using the same empirical distribution as one of the identified data sets. This work indicates that DBSCAN shows some improvement in detecting outliers over GasDay's existing technique and merits further exploration.
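
    To make the two approaches named in the abstract concrete, the following is a minimal sketch, not the thesis's implementation. It assumes one-dimensional residuals, absolute difference as the distance, and illustrative values for `eps` and `min_pts`; the function names and the sample residuals are hypothetical. It contrasts a k-sigma rule with the way DBSCAN labels a point as noise (a point that is neither a core point nor within `eps` of one).

```python
from statistics import mean, pstdev


def k_sigma_outliers(values, k=3.0):
    """Flag values lying more than k standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    return [abs(v - mu) > k * sigma for v in values]


def dbscan_noise_1d(values, eps, min_pts):
    """Flag the points that DBSCAN would label as noise.

    Assumes 1-D data with absolute difference as the distance.
    A point is a core point if at least min_pts points (itself included)
    lie within eps of it; a point is noise if it is not a core point and
    has no core point within eps.
    """
    n = len(values)
    neighbors = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    return [
        not is_core[i] and not any(is_core[j] for j in neighbors[i])
        for i in range(n)
    ]


# Hypothetical residuals: 30 small values plus one large spike.
base = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -0.1]
residuals = base * 3 + [5.0]

sigma_flags = k_sigma_outliers(residuals, k=3.0)
noise_flags = dbscan_noise_1d(residuals, eps=0.25, min_pts=4)
```

    On this toy input both methods flag only the final spike. The k-sigma rule depends on the mean and standard deviation, which rest on a normality assumption that the abstract reports does not hold for the GasDay residuals, whereas the density-based criterion depends only on local neighborhood counts.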

    Bioinformatics Systems And Mathematical Models For Improved Understanding Of Malaria Transmission, Control, And Elimination

    The leading malaria vector control strategies (i.e., long-lasting insecticidal nets and indoor residual spraying) can reduce indoor transmission, but these tools alone are insufficient to eliminate it. Strategies that target adult mosquitoes when they feed on humans or animals outdoors, or that target immature mosquito stages, are also needed to achieve malaria elimination. Improved data systems for integrating diverse experimental observations and research groups, as well as process-explicit mathematical models for evaluating them, are both essential to achieving these goals. We have developed a generic schema and data repositories for studies of malaria vectors that encompass a wide variety of experimental designs that rapidly generate large data volumes. We extended a malaria transmission model to examine the relationship between transmission, control, and the proportion of blood meals a vector population obtains from humans: assuming the lower limit for this indicator of human feeding preference enabled derivation of simplified models for zoophagic vectors. We present differential equation models to describe the biological processes that mediate a novel strategy to control malaria vectors by autodissemination of pyriproxyfen (PPF) as it is transferred from treated stations to gravid mosquitoes and then to the aquatic habitats where it inhibits mosquito emergence. Data from most of the mosquito studies we reviewed conformed to our generic schema, with four tables recording the experimental design, sorting of collections, details of samples, and additional observations. Our corresponding online repository includes 20 experiments, 8 projects, and 15 users at two institutes, resulting in 10 peer-reviewed publications. For zoophagic vectors, the results from the model can be used to forecast the likely immediate and delayed impacts of an intervention using only three field-measurable parameters. For the autodissemination of PPF, sensitivity analysis indicates that success of the strategy is plausible because ≥ 80% coverage of aquatic habitats with PPF appears achievable with modest, biologically plausible values of field-measurable input parameters. We have therefore applied two aspects of computational science (i.e., research data preparation using computer systems and scenario analysis with mathematical models) to address obstacles to the control and elimination of malaria.